73 research outputs found

    parallelMCMCcombine: An R Package for Bayesian Methods for Big Data and Analytics

    Recent advances in big data and analytics research have provided a wealth of large data sets that are too big to be analyzed in their entirety, due to restrictions on computer memory or storage size. New Bayesian methods have been developed for data sets that are large only because of their sample size; these methods partition big data sets into subsets, and perform independent Bayesian Markov chain Monte Carlo analyses on the subsets. The methods then combine the independent subset posterior samples to estimate a posterior density given the full data set. These approaches were shown to be effective for Bayesian models including logistic regression models, Gaussian mixture models and hierarchical models. Here, we introduce the R package parallelMCMCcombine, which carries out four of these techniques for combining independent subset posterior samples. We illustrate each of the methods using a Bayesian logistic regression model for simulation data and a Bayesian Gamma model for real data; we also demonstrate features and capabilities of the R package. The package assumes the user has carried out the Bayesian analysis and has produced the independent subposterior samples outside of the package. The methods are primarily suited to models with unknown parameters of fixed dimension that exist in continuous parameter spaces. We envision this tool will allow researchers to explore the various methods for their specific applications, and will assist future progress in this rapidly developing field. Comment: for published version see: http://www.plosone.org/article/fetchObject.action?uri=info%3Adoi%2F10.1371%2Fjournal.pone.0108425&representation=PD
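The partition, sample, and combine workflow described in this abstract can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the package's API and not necessarily one of its four techniques: it uses precision-weighted averaging of subposterior draws (in the style of consensus Monte Carlo) for a normal mean with known standard deviation and a flat prior. The function names `subposterior_draws` and `consensus_combine` are hypothetical, and exact posterior sampling stands in for the MCMC run on each subset.

```python
import random
import statistics


def subposterior_draws(data, sigma, n_draws, rng):
    """Draw from the posterior of a normal mean on one data subset.

    Assumes a flat prior and known sigma, so the subposterior is exactly
    N(mean(data), sigma^2 / len(data)). This is a stand-in for an MCMC run.
    """
    mean = statistics.fmean(data)
    sd = sigma / len(data) ** 0.5
    return [rng.gauss(mean, sd) for _ in range(n_draws)]


def consensus_combine(subsamples):
    """Combine scalar subposterior draws by precision-weighted averaging.

    Each subset is weighted by the inverse of its sample variance, and the
    s-th combined draw is the weighted average of the s-th draw of each subset.
    """
    weights = [1.0 / statistics.variance(draws) for draws in subsamples]
    total = sum(weights)
    n_draws = min(len(draws) for draws in subsamples)
    return [
        sum(w * draws[s] for w, draws in zip(weights, subsamples)) / total
        for s in range(n_draws)
    ]


rng = random.Random(0)
sigma, true_mean = 2.0, 1.5  # illustrative values, not from the paper

# Simulate a "full" data set and partition it into three subsets.
data = [rng.gauss(true_mean, sigma) for _ in range(6000)]
shards = [data[i::3] for i in range(3)]

# Sample each subposterior independently, then combine the draws.
subsamples = [subposterior_draws(shard, sigma, 4000, rng) for shard in shards]
combined = consensus_combine(subsamples)

print("full-data posterior mean:", statistics.fmean(data))
print("combined draws mean:     ", statistics.fmean(combined))
```

In this Gaussian setting the combined draws should approximate the full-data posterior, N(mean(data), sigma^2 / N); for non-Gaussian posteriors, weighted averaging is only an approximation.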

    Asymptotic properties and approximation of Bayesian logspline density estimators for communication-free parallel computing methods

    In this article we perform an asymptotic analysis of Bayesian parallel density estimators which are based on logspline density estimation. The parallel estimator we introduce is in the spirit of a kernel density estimator introduced in recent studies. We provide a numerical procedure that produces the density estimator itself in place of the sampling algorithm. We then derive an error bound for the mean integrated squared error for the full data posterior density estimator. We also investigate the parameters that arise from logspline density estimation and the numerical approximation procedure. Our investigation identifies specific choices of parameters for logspline density estimation that result in the error bound scaling appropriately in relation to these choices. Comment: 33 pages, 11 figures

    Lying About Terrorism

    Conventional wisdom holds that terrorism is committed for strategic reasons as a form of costly signaling to an audience. However, since over half of terrorist attacks are not credibly claimed, conventional wisdom does not explain many acts of terrorism. This article suggests that there are four lies about terrorism that can be incorporated in a rationalist framework: false claiming, false flag, the hot-potato problem, and the lie of omission. Each of these lies about terrorism can be strategically employed to help a group achieve its desired goal(s) without necessitating that an attack be truthfully claimed.

    RNA-seq in the tetraploid Xenopus laevis enables genome-wide insight in a classic developmental biology model organism

    Advances in sequencing technology have significantly changed the landscape of developmental biology research. The dissection of genetic networks in model and nonmodel organisms has been greatly enhanced with high-throughput sequencing technologies. RNA-seq has revolutionized the ability to perform developmental biology research in organisms without a published genome sequence. Here, we describe a protocol for developmental biologists to perform RNA-seq on dissected tissue or whole embryos. We start with the isolation of RNA and generation of sequencing libraries. We further show how to interpret and analyze the large amount of sequencing data that is generated in RNA-seq. We explore the abilities to examine differential expression, gene duplication, transcript assembly, alternative splicing and SNP discovery. For the purposes of this article, we use Xenopus laevis as the model organism to discuss uses of RNA-seq in an organism without a fully annotated genome sequence.

    Xenopus: An emerging model for studying congenital heart disease

    Congenital heart defects affect nearly 1% of all newborns and are a significant cause of infant death. Clinical studies have identified a number of congenital heart syndromes associated with mutations in genes that are involved in the complex process of cardiogenesis. The African clawed frog, Xenopus, has been instrumental in studies of vertebrate heart development and provides a valuable tool to investigate the molecular mechanisms underlying human congenital heart diseases. In this review, we discuss the methodologies that make Xenopus an ideal model system to investigate heart development and disease. We also outline congenital heart conditions linked to cardiac genes that have been well-studied in Xenopus and describe some emerging technologies that will further aid in the study of these complex syndromes.